feat(native): Add support for REST based remote function#23568
feat(native): Add support for REST based remote function#23568tdcmeehan merged 8 commits intoprestodb:masterfrom
Conversation
808f416 to
1c7ea23
Compare
86c1f8b to
3345b51
Compare
0d73f09 to
f530b51
Compare
32a8715 to
5726f00
Compare
a8afff9 to
26f9fe6
Compare
|
Thanks for the release note! Please format it so the automation picks it up, like: |
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham
|
@Joe-Abraham : Please also work on a follow-up PR for user documentation on how to setup and use rest functions in Native SQL engine queries. |
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham
| RemoteDoubleDivHandler( | ||
| velox::RowTypePtr inputTypes, | ||
| velox::TypePtr outputType) | ||
| : RemoteFunctionRestHandler( |
There was a problem hiding this comment.
This is a bit counterintuitive to me. So the function body assumes this function is for (DOUBLE(), DOUBLE()) -> DOUBLE()) which means that RemoteDoubleDivHandler shouldn't be taking types as input but only calling RemoteFunctionRestHandler with the correct types.
There was a problem hiding this comment.
This handler is hardcoded to work with a signature of (DOUBLE, DOUBLE) -> DOUBLE for testing purposes. While it could be extended to support generic numeric types (e.g., using templates or type checking), since that’s not the primary focus of these tests, we maintain the hardcoded types for simplicity.
There was a problem hiding this comment.
There seems like a misunderstanding here. I feel the code for this and other functions should be as follows since the function knows what types it supports. The client shouldn't have to specify the types in the register functions.
RemoteDoubleDivHandler()
: RemoteFunctionRestHandler(
std::vector<TypePtr> inputTypes = {DOUBLE(), DOUBLE()};
TypePtr outputType = DOUBLE();
std::move(inputTypes),
std::move(outputType)) {}
There was a problem hiding this comment.
Got it, I have made improvements to the RemoteFunctionRestHandler.
| assertEqualVectors(expected, results); | ||
| } | ||
|
|
||
| TEST_P(RemoteFunctionRestTest, removeCharactersFromString) { |
There was a problem hiding this comment.
Nit : Shorten function names : removeChar
| assertEqualVectors(expected, results); | ||
| } | ||
|
|
||
| TEST_P(RemoteFunctionRestTest, inverseCdfInvalidInputsServerThrowsException) { |
There was a problem hiding this comment.
Nit : Shorten inverseCdfException
There was a problem hiding this comment.
Hey there - I've reviewed your changes and they look great!
Prompt for AI Agents
Please address the comments from this code review:
## Individual Comments
### Comment 1
<location> `presto-native-execution/presto_cpp/main/types/tests/RestFunctionHandleTest.cpp:121-116` </location>
<code_context>
+TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithDecimalType) {
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for edge cases with decimal types, such as negative scales, very large precisions, or invalid type strings.
Additional tests for invalid decimal type strings, negative scales, and large precisions will help verify robust parsing and error handling.
Suggested implementation:
```cpp
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithDecimalType) {
try {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.decimalFunction;decimal(10,2);decimal(10,2)",
"version": "v1",
"signature": {
"name": "decimalFunction",
"kind": "SCALAR",
"returnType": "decimal(10,2)",
```
```cpp
} catch (const std::exception& e) {
FAIL() << "Exception: " << e.what();
}
}
// Edge case: negative scale
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithNegativeScale) {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.decimalFunction;decimal(10,-2);decimal(10,-2)",
"version": "v1",
"signature": {
"name": "decimalFunction",
"kind": "SCALAR",
"returnType": "decimal(10,-2)",
"arguments": [
{
"name": "arg0",
"type": "decimal(10,-2)"
}
]
}
}
)JSON";
try {
auto handle = RestFunctionHandle::fromJson(str);
FAIL() << "Expected exception for negative scale, but parsing succeeded.";
} catch (const std::exception& e) {
SUCCEED();
}
}
// Edge case: very large precision
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithLargePrecision) {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.decimalFunction;decimal(1000,2);decimal(1000,2)",
"version": "v1",
"signature": {
"name": "decimalFunction",
"kind": "SCALAR",
"returnType": "decimal(1000,2)",
"arguments": [
{
"name": "arg0",
"type": "decimal(1000,2)"
}
]
}
}
)JSON";
try {
auto handle = RestFunctionHandle::fromJson(str);
// Depending on implementation, this may succeed or fail.
// If large precision is not supported, expect an exception.
// Otherwise, check the type.
auto returnType = handle->signature().returnType();
EXPECT_EQ(returnType->kind(), TypeKind::DECIMAL);
EXPECT_EQ(returnType->asDecimal().precision(), 1000);
EXPECT_EQ(returnType->asDecimal().scale(), 2);
} catch (const std::exception& e) {
// Acceptable if implementation throws for large precision.
SUCCEED();
}
}
// Edge case: invalid type string
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithInvalidTypeString) {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.decimalFunction;decimal(foo,bar);decimal(foo,bar)",
"version": "v1",
"signature": {
"name": "decimalFunction",
"kind": "SCALAR",
"returnType": "decimal(foo,bar)",
"arguments": [
{
"name": "arg0",
"type": "decimal(foo,bar)"
}
]
}
}
)JSON";
try {
auto handle = RestFunctionHandle::fromJson(str);
FAIL() << "Expected exception for invalid decimal type string, but parsing succeeded.";
} catch (const std::exception& e) {
SUCCEED();
}
}
```
</issue_to_address>
### Comment 2
<location> `presto-native-execution/presto_cpp/main/types/tests/RestFunctionHandleTest.cpp:180-116` </location>
<code_context>
+TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayType) {
</code_context>
<issue_to_address>
**suggestion (testing):** Add tests for empty arrays and arrays with nested nulls.
Please include test cases for empty arrays and arrays containing null elements to verify correct handling by the parser.
Suggested implementation:
```cpp
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayType) {
try {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.arrayFunction;array(bigint);array(varchar)",
"version": "v1",
"signature": {
"name": "arrayFunction",
"kind": "SCALAR",
"returnType": "array(bigint)",
```
```cpp
} catch (const std::exception& e) {
FAIL() << "Exception: " << e.what();
}
}
// Test for empty array argument
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithEmptyArray) {
try {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.emptyArrayFunction;array(bigint)",
"version": "v1",
"signature": {
"name": "emptyArrayFunction",
"kind": "SCALAR",
"returnType": "array(bigint)",
"argumentTypes": ["array(bigint)"]
},
"arguments": [
[]
]
}
)JSON";
auto handle = RestFunctionHandle::fromJson(str);
ASSERT_NE(handle, nullptr);
ASSERT_EQ(handle->signature().name, "emptyArrayFunction");
ASSERT_EQ(handle->arguments().size(), 1);
// Check that the argument is an empty array
auto* arrayArg = dynamic_cast<const ArrayArgument*>(handle->arguments()[0].get());
ASSERT_NE(arrayArg, nullptr);
EXPECT_TRUE(arrayArg->values().empty());
} catch (const std::exception& e) {
FAIL() << "Exception: " << e.what();
}
}
// Test for array with nested nulls
TEST_F(RestFunctionHandleTest, parseRestFunctionHandleWithArrayContainingNulls) {
try {
const std::string str = R"JSON(
{
"@type": "RestFunctionHandle",
"functionId": "remote.testSchema.nullArrayFunction;array(bigint)",
"version": "v1",
"signature": {
"name": "nullArrayFunction",
"kind": "SCALAR",
"returnType": "array(bigint)",
"argumentTypes": ["array(bigint)"]
},
"arguments": [
[1, null, 3, null]
]
}
)JSON";
auto handle = RestFunctionHandle::fromJson(str);
ASSERT_NE(handle, nullptr);
ASSERT_EQ(handle->signature().name, "nullArrayFunction");
ASSERT_EQ(handle->arguments().size(), 1);
auto* arrayArg = dynamic_cast<const ArrayArgument*>(handle->arguments()[0].get());
ASSERT_NE(arrayArg, nullptr);
ASSERT_EQ(arrayArg->values().size(), 4);
// Check for nulls in the array
EXPECT_TRUE(arrayArg->values()[1] == nullptr);
EXPECT_TRUE(arrayArg->values()[3] == nullptr);
EXPECT_EQ(*static_cast<const int64_t*>(arrayArg->values()[0].get()), 1);
EXPECT_EQ(*static_cast<const int64_t*>(arrayArg->values()[2].get()), 3);
} catch (const std::exception& e) {
FAIL() << "Exception: " << e.what();
}
}
```
</issue_to_address>
### Comment 3
<location> `presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoContainerRemoteFunction.java:42-43` </location>
<code_context>
+ true);
+ }
+
+ @Test
+ public void testRemoteBasicTests()
+ {
+ assertEquals(
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for remote functions with invalid arguments or unsupported types.
Please include test cases that verify error handling when remote functions receive invalid arguments or unsupported types.
</issue_to_address>
### Comment 4
<location> `presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestPrestoContainerRemoteFunction.java:71-84` </location>
<code_context>
+ }
+
+ @Test
+ public void testRemoteFunctionAppliedToColumn()
+ {
+ assertEquals(computeActual("SELECT remote.default.floor(o_totalprice) FROM tpch.sf1.orders")
+ .getMaterializedRows().size(), 1500000);
+ assertEquals(computeActual("SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem")
+ .getMaterializedRows().size(), 6001215);
+ assertQueryWithSameQueryRunner(
+ "SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem",
+ "SELECT abs(l_discount) FROM tpch.sf1.lineitem");
+ assertEquals(computeActual("SELECT remote.default.length(CAST(o_comment AS VARBINARY)) FROM tpch.sf1.orders")
+ .getMaterializedRows().size(), 1500000);
+ }
+}
</code_context>
<issue_to_address>
**suggestion (testing):** Consider adding tests for remote functions on empty tables or with null values.
Please include tests for remote functions on empty tables and columns containing nulls to verify correct behavior in these cases.
```suggestion
@Test
public void testRemoteFunctionAppliedToColumn()
{
assertEquals(computeActual("SELECT remote.default.floor(o_totalprice) FROM tpch.sf1.orders")
.getMaterializedRows().size(), 1500000);
assertEquals(computeActual("SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem")
.getMaterializedRows().size(), 6001215);
assertQueryWithSameQueryRunner(
"SELECT remote.default.abs(l_discount) FROM tpch.sf1.lineitem",
"SELECT abs(l_discount) FROM tpch.sf1.lineitem");
assertEquals(computeActual("SELECT remote.default.length(CAST(o_comment AS VARBINARY)) FROM tpch.sf1.orders")
.getMaterializedRows().size(), 1500000);
}
@Test
public void testRemoteFunctionOnEmptyTable()
{
// Create an empty table for testing
computeActual("CREATE TABLE test_empty_table (x INTEGER)");
// Query remote function on empty table
assertEquals(computeActual("SELECT remote.default.abs(x) FROM test_empty_table").getMaterializedRows().size(), 0);
// Drop the table after test
computeActual("DROP TABLE test_empty_table");
}
@Test
public void testRemoteFunctionOnNullValues()
{
// Create a table with null values
computeActual("CREATE TABLE test_null_table (y INTEGER)");
computeActual("INSERT INTO test_null_table VALUES (NULL), (NULL), (1), (NULL)");
// Query remote function on column with nulls
List<?> result = computeActual("SELECT remote.default.abs(y) FROM test_null_table").getMaterializedRows();
// There should be 4 rows, 3 nulls and 1 with value 1
int nullCount = 0;
int oneCount = 0;
for (Object row : result) {
Object value = ((com.facebook.presto.spi.Page) row).getField(0);
if (value == null) {
nullCount++;
} else if (value.toString().equals("1")) {
oneCount++;
}
}
assertEquals(nullCount, 3);
assertEquals(oneCount, 1);
// Drop the table after test
computeActual("DROP TABLE test_null_table");
}
}
```
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
|
Thanks for the release note! Formatting nit because the Presto documentation uses .rst and not Markdown format links. |
aditi-pandit
left a comment
There was a problem hiding this comment.
Have a minor nit.
@czentgr, @majetideepak : Please can you do a round of review.
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham.
czentgr
left a comment
There was a problem hiding this comment.
Thnaks, one nit to remove some strings with the corresponding existing constants.
| } | ||
|
|
||
| auto msg = fmt::format( | ||
| "Unsupported Content-Type: '{}'. Expecting 'application/X-presto-pages' " |
There was a problem hiding this comment.
Nit: we have constants CONTENT_TYPE_PRESTO_PAGE and CONTENT_TYPE_SPARK_UNSAFE_ROW that can be used here and below.
aditi-pandit
left a comment
There was a problem hiding this comment.
Thanks @Joe-Abraham
during docker build of native execution |
@karthikeyann, I have updated the Jenkinsfile - #26662 |
|
This failure is during coordinator image build locally. |
Description
Adds support of REST based remote function to presto native.
The following features will be added in the subsequent PRs and is not addressed in this PR
Motivation and Context
Impact
Test Plan
Contributor checklist
Release Notes
Co-authored-by: Wills Feng wills.feng@ibm.com